We propose JFP, a Joint Future Prediction model that learns to generate accurate and consistent multi-agent future trajectories. Many methods for this task capture social interactions in the encoding part of the model; considerably less attention has been paid to representing interactions in the decoder and output stages. As a result, the predicted trajectories are not necessarily consistent with each other and often overlap unrealistically. In contrast, we propose an end-to-end trainable model that directly learns the interactions between pairs of agents in a structured, graphical model formulation in order to generate consistent future trajectories. It sets new state-of-the-art results on the Waymo Open Motion Dataset (WOMD) for the interactive setting. We also investigate a more complex multi-agent setting for both WOMD and a larger internal dataset, where our approach improves significantly on the trajectory overlap metrics while obtaining on-par or better performance on single-agent trajectory metrics.
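Predicting the future behavior of road users is one of the most challenging and important problems in autonomous driving. Applying deep learning to this problem requires fusing heterogeneous world state, in the form of rich perception signals and map information, and inferring highly multimodal distributions over possible futures. In this paper, we present MultiPath++, a future prediction model that achieves state-of-the-art performance on popular benchmarks. MultiPath++ improves the MultiPath architecture by revisiting many design choices. The first key design difference is a departure from dense image-based encoding of the input world state in favor of a sparse encoding of heterogeneous scene elements: MultiPath++ consumes compact and efficient polylines to describe road features, and raw agent state information directly (e.g., position, velocity, acceleration). We propose a context-aware fusion of these elements and develop a reusable multi-context gating fusion component. Second, we reconsider the choice of pre-defined, static anchors, and develop a way to learn latent anchor embeddings end-to-end in the model. Lastly, we explore ensembling and output aggregation techniques (common in other ML domains) and find effective variants for our probabilistic multimodal output representation. We perform an extensive ablation on these design choices and show that our proposed model achieves state-of-the-art performance on the Argoverse Motion Forecasting Competition and the Waymo Open Dataset Motion Prediction Challenge.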
Small to medium-scale data science experiments often rely on research software developed ad hoc by individual scientists or small teams. Often there is no time to make the research software fast, reusable, and open access. The consequence is twofold. First, subsequent researchers must spend significant work hours building upon the proposed hypotheses or experimental framework. In the worst case, others cannot reproduce the experiment and reuse the findings for subsequent research. Second, if the ad hoc research software fails during long-running, computationally expensive experiments, the effort to iteratively improve the software and rerun the experiments puts the researchers under significant time pressure. We suggest making caching an integral part of the research software development process, even before the first line of code is written. This article outlines caching recommendations for developing research software in data science projects. Our recommendations provide a perspective to circumvent common problems such as proprietary dependencies, speed, etc. At the same time, caching contributes to the reproducibility of experiments in the open science workflow. Concerning the four guiding principles, i.e., Findability, Accessibility, Interoperability, and Reusability (FAIR), we foresee that including the proposed recommendations in research software development will make the data related to that software FAIRer for both machines and humans. We exhibit the usefulness of some of the proposed recommendations on our recently completed research software project in mathematical information retrieval.
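The kind of caching this abstract argues for can be illustrated with a minimal sketch. It is not the article's own implementation: the decorator name `disk_cache`, the pickle-on-disk layout, and the key scheme (a SHA-256 hash of the function name and arguments) are all assumptions made for illustration. The idea is that an expensive step writes its result to disk once, so a rerun after a crash or a code change elsewhere reuses it instead of recomputing.

```python
import functools
import hashlib
import json
import pickle
import tempfile
from pathlib import Path


def disk_cache(cache_dir):
    """Hypothetical decorator: store each result on disk, keyed by the call."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Build a stable key from the function name and its arguments.
            # Non-JSON arguments fall back to repr (an assumption that repr
            # is deterministic for the values being passed in).
            key_src = json.dumps(
                [func.__name__, args, kwargs], sort_keys=True, default=repr
            )
            key = hashlib.sha256(key_src.encode("utf-8")).hexdigest()
            path = cache_dir / f"{key}.pkl"

            if path.exists():
                # Cache hit: load the stored result instead of recomputing.
                with path.open("rb") as fh:
                    return pickle.load(fh)

            result = func(*args, **kwargs)

            # Write to a temporary file first, then rename, so an interrupted
            # run does not leave a half-written cache entry behind.
            tmp = path.with_suffix(".tmp")
            with tmp.open("wb") as fh:
                pickle.dump(result, fh)
            tmp.replace(path)
            return result

        return wrapper

    return decorator


# Usage: the second call is served from disk, so the body runs only once.
calls = []
with tempfile.TemporaryDirectory() as tmp_dir:

    @disk_cache(tmp_dir)
    def slow_square(x):
        calls.append(x)  # records every real (non-cached) execution
        return x * x

    first = slow_square(4)
    second = slow_square(4)
```

Keying on the function name and arguments alone means a change to the function body does not invalidate old entries; a real project would likely add a version tag or a hash of the source to the key.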
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of the medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.
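A fundamental question on the use of ML models concerns the explanation of their predictions for increasing transparency in decision-making. Although several interpretability methods have emerged, some gaps regarding the reliability of their explanations have been identified. For instance, most methods are unstable (meaning that they give very different explanations with slight changes in the data) and do not cope well with irrelevant features (that is, features not related to the label). This article introduces two new interpretability methods, namely VarImp and SupClus, that overcome these issues by using local regression fits with a weighted distance that takes variable importance into account. Whereas VarImp generates explanations for each instance and can be applied to datasets with more complex relationships, SupClus interprets clusters of instances with similar explanations and can be applied to simpler datasets where clusters can be found. We compare our methods with state-of-the-art approaches and show that they yield better explanations according to several metrics, particularly in high-dimensional problems with irrelevant features, as well as when the relationship between features and target is non-linear.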
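Uncertainty quantification is critical for the adoption of machine learning, especially for rejecting out-of-distribution (OOD) data back to human experts for review. Yet progress has been slow, as a balance must be struck between computational efficiency and the quality of uncertainty estimates. For this reason, many use deep ensembles of neural networks or Monte Carlo dropout to obtain reasonable uncertainty estimates at relatively minimal compute and memory cost. Surprisingly, when we focus on the real-world applicable constraint of a $\leq 1\%$ false positive rate (FPR), prior methods fail to reliably detect OOD samples as such. Notably, even Gaussian random noise fails to trigger these popular OOD techniques. We help to alleviate this problem by devising a simple adversarial training scheme that incorporates an attack on the epistemic uncertainty predicted by the dropout ensemble. We demonstrate that this method improves OOD detection performance on standard data (i.e., not adversarially crafted) and improves the standardized partial AUC from near-random guessing performance to $\geq 0.75$.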
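To make the $\leq 1\%$ FPR constraint concrete, the following is a minimal sketch of how a rejection threshold could be fixed from in-distribution uncertainty scores and then used to measure how many OOD samples are caught. It is not the paper's evaluation code: the function names `threshold_at_fpr` and `detection_rate` are hypothetical, and it assumes each sample already has a scalar uncertainty score where a higher value means "more likely OOD".

```python
import math


def threshold_at_fpr(id_scores, max_fpr=0.01):
    """Pick a threshold so that at most `max_fpr` of the in-distribution
    scores lie strictly above it (hypothetical helper)."""
    ranked = sorted(id_scores, reverse=True)
    # Number of in-distribution samples we are allowed to flag wrongly.
    allowed = math.floor(max_fpr * len(ranked))
    if allowed >= len(ranked):
        return float("-inf")  # every sample may be flagged
    # Scores strictly greater than the (allowed + 1)-th largest score
    # number at most `allowed`, so the FPR budget is respected.
    return ranked[allowed]


def detection_rate(ood_scores, threshold):
    """Fraction of OOD samples whose score exceeds the threshold."""
    return sum(score > threshold for score in ood_scores) / len(ood_scores)


# Usage with made-up scores: 100 in-distribution values 0.00 .. 0.99.
id_scores = [i / 100 for i in range(100)]
ood_scores = [0.5, 0.985, 0.99, 1.2]

threshold = threshold_at_fpr(id_scores, max_fpr=0.01)
id_false_positive_rate = detection_rate(id_scores, threshold)
ood_detection_rate = detection_rate(ood_scores, threshold)
```

With so few in-distribution samples the threshold is coarse; in practice the quantile would be estimated on a large held-out set. The standardized partial AUC mentioned in the abstract summarizes the whole low-FPR region rather than a single operating point, and is not computed here.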
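Given a full fingerprint image (rolled or slap), we present CycleGAN models to generate multiple latent impressions of the same identity as the full print. Our models can control the degree of distortion, noise, blurriness, and occlusion in the generated latent print images to obtain the Good, Bad, and Ugly latent image categories introduced in the NIST SD27 latent database. The contributions of our work are twofold: (i) we demonstrate the similarity of synthetically generated latent fingerprint images to crime scene latents in the NIST SD27 and MSP databases, as evaluated by the NIST NFIQ 2 quality measure and by ROC curves obtained with a SOTA fingerprint matcher, and (ii) we use synthetic latents to augment small latent training databases in the public domain to improve the performance of DeepPrint, a SOTA fingerprint matcher designed for rolled-to-rolled fingerprint matching, on three latent databases (NIST SD27, NIST SD302, and IIITD-SLF). For example, with synthetic latent data augmentation, the Rank-1 retrieval performance of DeepPrint improves from 15.50% to 29.07% on the challenging NIST SD27 latent database. Our approach for generating synthetic latent fingerprints can be used to improve the recognition performance of any latent matcher and its individual components (e.g., enhancement, segmentation, and feature extraction).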
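Answer set programs used in real-world applications often require that the program be usable with different input data. This, however, can often lead to contradictory statements and consequently to an inconsistent program. The causes of potential contradictions in a program are conflicting rules. In this paper, we show how to ensure that a program $\mathcal{P}$ remains non-contradictory given any allowed set of such input data. For that, we introduce the notion of conflict-resolving $\lambda$-extensions. A conflict-resolving $\lambda$-extension for a conflicting rule $r$ is a set $\lambda$ of (default) literals such that extending the body of $r$ by $\lambda$ resolves all conflicts of $r$ at once. We investigate the properties that suitable $\lambda$-extensions should possess and, building on that, develop a strategy to compute all such conflict-resolving $\lambda$-extensions for each conflicting rule in $\mathcal{P}$. We show that implementing a conflict resolution process that successively resolves conflicts using $\lambda$-extensions eventually yields a program that remains non-contradictory given any allowed set of input data.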
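The dynamic motion of humans and robots is widely driven by posture-dependent nonlinear interactions between their degrees of freedom. However, these dynamical effects remain mostly overlooked when studying the mechanisms of human movement generation. Inspired by recent works, we hypothesize that human motions are planned as sequences of geodesic synergies, and thus correspond to coordinated joint movements achieved with piecewise minimum energy. The underlying computational model is built on Riemannian geometry to account for the inertial characteristics of the body. Through the analysis of various human arm motions, we find that our model segments motions into geodesic synergies and successfully predicts observed arm postures, hand trajectories, and their respective velocity profiles. Moreover, we show that our analysis can further be exploited to transfer arm motions to robots by reproducing individual human synergies as geodesic paths in the robot configuration space.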
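When using vision-based approaches to classify individual parking spaces as occupied or empty, human experts often need to annotate the locations and label a training set containing images collected in the target parking lot to fine-tune the system. We propose investigating three annotation types (polygons, bounding boxes, and fixed-size squares), which provide different data representations of the parking spaces. The rationale is to elucidate the best trade-off between manual annotation precision and model performance. We also investigate the number of annotated parking spaces necessary to fine-tune a pre-trained model in the target parking lot. Experiments using the PKLot dataset show that it is possible to fine-tune a model to the target parking lot with fewer than 1,000 labeled samples, using low-precision annotations such as fixed-size squares.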